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ABSTRACT 


In  this  paper  we  introduce  a  new  paradigm.  Random  Sample  Consensus 
(RANSAC),  for  fitting  a  model  to  experimental  data.  RANSAC  is  capable 
of  interpreting/smoothing  data  containing  a  significant  percentage  of 
gross  errors,  and  thus  is  ideally  suited  for  applications  in  automated 
image  analysis  where  interpretation  is  based  on  the  data  provided  by 
error-prone  feature  detectors.  A  major  portion  of  this  paper  describes 
the  application  of  RANSAC  to  the  Location  Determination  Problem 
(LDP):  given  an  image  depicting  a  set  of  landmarks  with  known 
locations,  determine  that  point  in  space  from  which  the  image  was 
obtained.  In  response  to  a  RANSAC  requirement,  we  derive  new  results  on 
the  minimum  number  of  landmarks  needed  to  obtain  a  solution,  and  present 
algorithms  for  computing  these  minimum-landmark  solutions  in  closed 
form.  These  results  provide  the  basis  for  an  automatic  system  that  can 
solve  the  LDP  under  difficult  viewing  and  analysis  conditions. 
Implementation  details  and  computational  examples  are  also  presented. 


ii 


I  INTRODUCTION 


In  this  paper  we  introduce  a  new  paradigm.  Random  Sample  Consensus 
(RANSAC) ,  for  fitting  a  model  to  experimental  data;  and  we  illustrate 
its  use  in  scene  analysis  and  automated  cartography.  The  application 
discussed,  the  location  determination  problem  (LDP),  is  treated  at  a 
level  beyond  that  of  a  mere  example  of  the  use  of  the  RANSAC  paradigm; 
we  present  new  basic  findings  concerning  the  conditions  under  which  the 
LDP  can  be  solved,  and  describe  a  comprehensive  approach  to  the  solution 
of  this  problem  that  we  anticipate  will  have  near-term  practical 
applications . 

To  a  large  extent,  scene  analysis  (and  in  fact,  science  in  general) 
is  concerned  with  the  interpretation  of  sensed  data  in  terms  of  a  set  of 
predefined  models.  Conceptually,  interpretation  involves  two  distinct 
activities:  first,  there  is  the  problem  of  finding  the  best  match 
between  the  data  and  one  of  the  available  models  (the  classification 
problem);  second,  there  is  the  problem  of  computing  the  best  values  for 
the  free  parameters  of  the  selected  model  (the  parameter  estimation 
problem).  In  practice,  these  two  problems  are  not  independent — a 
solution  to  the  parameter  estimation  problem  is  often  required  to  solve 
the  classification  problem. 

Classical  techniques  for  parameter  estimation,  such  as  "least 
squares,"  optimize  (according  to  a  specified  objective  function)  the  fit 
of  a  functional  description  (model)  to  ALL  of  the  presented  data.  These 
techniques  have  no  internal  mechanisms  for  detecting  and  rejecting  gross 
errors.  They  are  averaging  techniques  that  rely  on  the  assumption  (the 
"smoothing  assumption")  that  the  maximum  expected  deviation  of  any  datum 
from  the  assumed  model  is  a  direct  function  of  the  size  of  the  data  set, 
and  thus  regardless  of  the  size  of  the  data  set,  there  will  always  be 
enough  "good"  values  to  "smooth  out"  any  gross  deviations. 

In  many  practical  parameter  estimation  problems  the  smoothing 
assumption  does  not  hold;  that  is,  the  data  contains  uncompensated  gross 
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errors.  To  deal  with  this  situation,  a  number  of  heuristics  have  been 
proposed.  The  technique  usually  employed  is  some  variation  of  the  idea 
of  first  using  all  the  data  to  derive  the  model  parameters;  next,  locate 
the  datum  that  is  farthest  from  agreement  with  the  instantiated  model, 
assume  that  it  is  a  gross  error,  delete  it,  and  iterate  this  process 
until  either  the  maximum  deviation  is  less  then  some  preset  threshold, 
or  until  there  is  no  longer  sufficient  data  to  proceed. 

It  can  easily  be  shown  that  a  single  gross  error  ("poisoned 

point"),  mixed  in  with  a  set  of  good  data,  can  cause  the  above  heuristic 

to  fail  (for  example,  see  Figure  1).  It  is  our  contention  that 

averaging  is  not  an  appropriate  technique  to  apply  to  an  "unverified" 
data  set. 

In  the  following  section  we  introduce  the  RANSAC  paradigm,  which  is 
capable  of  smoothing  data  that  contains  a  significant  percentage  of 
gross  errors.  This  paradigm  is  particularly  applicable  to  scene 
analysis  because  local  feature  detectors,  which  often  make  mistakes,  are 
the  source  of  the  data  provided  to  the  interpretation  algorithms .  Local 
feature  detectors  make  two  types  of  errors-- c lassif ication  errors  and 

measurement  errors.  Classification  errors  occur  when  a  feature  detector 
incorrectly  identifies  a  portion  of  an  image  as  an  occurrence  of  a 
feature.  Measurement  errors  occur  when  the  feature  detector  correctly 
identifies  the  feature,  but  slightly  miscalculates  one  of  its  parameters 
(e.g.,  its  image  location).  Measurement  errors  generally  follow  a 
normal  distribution,  and  therefore  the  smoothing  assumption  applies  to 
them.  Classification  errors,  however,  are  gross  errors  because  they 
have  a  significantly  larger  effect  than  measurement  errors  and  they  do 
not  average  out. 

In  the  final  sections  of  this  paper  we  discuss  the  application  of 
RANSAC  to  the  location  determination  problem: 

Given  a  set  of  "landmarks"  ("control  points"),  whose 
locations  are  known  in  some  coordinate  frame,  determine 
the  location  (relative  to  the  coordinate  frame  of  the 
landmarks)  of  that  point  in  space  from  which  an  image  of 
the  landmarks  was  obtained. 
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In  response  to  a  RANSAC  requirement,  we  first  derive  some  new 
results  on  the  minimum  number  of  landmarks  needed  to  obtain  a  solution, 
and  then  present  algorithms  for  computing  these  minimum-landmark 
solutions  in  closed  form.  (Conventional  techniques  are  iterative  and 
require  a  good  initial  guess  to  assure  convergence.)  These  results  form 
the  basis  for  an  automatic  system  that  can  solve  the  LDP  under  severe 
viewing  and  analysis  conditions.  In  particular,  the  system  performs 
properly  even  if  a  significant  number  of  the  landmarks  are  incorrectly 
located  due  to  low  visibility,  terrain  changes,  or  image  analysis 
errors.  Implementation  details  and  experimental  results  are  presented 
to  complete  our  description  of  the  LDP  application. 


II  RANDOM  SAMPLE  CONSENSUS 

The  philosophy  of  RANSAC  is  opposite  to  that  of  conventional 
smoothing  techniques — rather  than  using  as  much  of  the  data  as  possible 
to  obtain  an  initial  solution  and  then  attempting  to  eliminate  the 
invalid  data  points,  RANSAC  uses  as  small  an  initial  data  set  as 
feasible  and  enlarges  this  set  with  consistent  data  when  possible.  For 
example,  given  the  task  of  fitting  an  arc  of  a  circle  to  a  set  of  two- 
dimensional  points,  the  RANSAC  approach  would  be  to  select  a  set  of 
three  points  (since  three  points  are  required  to  determine  a  circle), 
compute  the  center  and  radius  of  the  implied  circle,  and  count  the 
number  of  points  that  are  close  enough  to  that  circle  to  suggest  their 
compatibility  with  it  (i.e. ,  their  deviations  are  small  enough  to  be 
measurement  errors).  If  there  are  enough  compatible  points,  RANSAC 
would  employ  a  smoothing  technique,  such  as  least  squares,  to  compute  an 
improved  estimate  for  the  parameters  of  the  circle  now  that  a  set  of 
mutually  consistent  points  has  been  identified. 

The  RANSAC  paradigm  is  more  formally  stated  as  follows: 

Given  a  model  that  requires  a  minimum  of  n  data  points  to 
instantiate  its  free  parameters,  and  a  set  of  data  points  P 
such  that  the  number  of  points  in  P  is  greater  than  n 
(#(P)  >  n),  randomly  select  a  subset  SI  of  n  data  points 
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from  P  and  Instantiate  the  model.  Use  the  instantiated 
model  Ml  to  determine  the  subset  SI*  of  points  in  P  that 
are  within  some  error  tolerance  of  Ml.  The  set  SI*  is 
called  the  consensus  set  of  SI. 

If  it  (SI*)  is  greater  than  some  threshold  t,  which  is  a 
function  of  the  estimate  of  the  number  of  gross  errors  in  P, 
use  SI*  to  compute  (possibly  using  least  squares)  a  new 
model  Ml*. 

If  (S 1* )  is  less  than  t,  randomly  select  a  new  subset  S2 
and  repeat  the  above  process.  If,  after  some  predetermined 
number  of  trials,  no  consensus  set  with  t  or  more  members 
has  been  found,  either  solve  the  model  with  the  largest 
consensus  set  found,  or  terminate  in  failure. 

There  are  two  obvious  improvements  to  the  above  algorithm:  first, 
if  there  is  a  problem-related  rationale  for  selecting  points  to  form  the 
S's,  use  a  deterministic  selection  process  instead  of  the  random  one; 
second,  once  a  suitable  consensus  set  S*  has  been  found  and  a  model  M* 
instantiated,  add  any  new  points  from  P  that  are  consistent  with  M*  to 
S*  and  compute  a  new  model  on  the  basis  of  this  larger  set. 

The  RANSAC  paradigm  contains  three  unspecified  parameters:  (1)  the 
error  tolerance  used  to  determine  whether  or  not  a  point  is  compatible 
with  a  model,  (2)  the  number  of  subsets  to  try,  and  (3)  the  threshold  t, 
which  is  the  number  of  compatible  points  used  to  imply  that  the  correct 
model  has  been  found.  We  discuss  methods  for  computing  reasonable 
values  for  these  parameters  in  the  following  subsections. 

A.  Error  Tolerance  For  Establishing  Datum/Model  Compatibility 

The  deviation  of  a  datum  from  a  model  is  a  function  of  the  error 
associated  with  the  datum  and  the  error  associated  with  the  model 
(which,  in  part,  is  a  function  of  the  errors  associated  with  the  data 
used  to  instantiate  the  model).  If  the  model  is  a  simple  function  of 
the  data  points,  it  may  be  practical  to  establish  reasonable  bounds  on 
error  tolerance  analytically.  However,  this  straightforward  approach  is 
often  unworkable;  for  such  cases  it  is  generally  possible  to  estimate 
bounds  on  error  tolerance  experimentally.  Sample  deviations  can  be 
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produced  fay  perturbing  the  data,  computing  the  model,  and  measuring  the 
implied  errors.  The  error  tolerance  could  then  be  set  at  one  or  two 
standard  deviations  beyond  the  measured  average  error. 

The  expected  deviation  of  datum  from  an  assumed  model  is  generally 
a  function  of  the  datum  and,  therefore,  the  error  tolerance  should  be 
different  for  each  datum.  However,  the  variation  in  error  tolerances  is 
usually  relatively  small  compared  to  the  size  of  a  gross  error.  Thus,  a 
single  error  tolerance  for  all  data  is  often  sufficient. 

B.  The  Maximum  Number  of  Attempts  to  Find  a  Consensus  Set 

The  decision  to  stop  selecting  new  subsets  of  P  can  be  based  upon 
the  expected  number  of  trials  k  required  to  select  a  subset  of  n  good 
data  points.  Let  w  be  the  probability  that  any  selected  data  point  is 
within  the  error  tolerance  of  the  model.  Then  we  have: 

E(k)  -  b  +  2* ( 1  -b ) *b  +  3*(l-b)2  *b  ...  +  ^(l-h)1"1^  +  ... 

E(k)  =  b* [  1  +  2*a  +  3*a2  ...  +  i*ai_1  +  ...] 
where:  E(k)  is  the  expected  value  of  k,  b  =  wn,  and  a  =  (1-b). 

An  identity  for  the  sum  of  a  geometric  series  is: 

a/  (1-a)  =•  a  +  a2  +  a3  .  . .  +  a1  +  . . . 

Differentiating  the  above  identity  with  respect  to  a,  we  have: 
l/(l-a)2  =  l  +  2*a  +  3*a^  ...  +  i*a^“l  +  ... 

Thus : 


E(k)  =  1/b  -  w"n 


The  following  is  a  tabulation,  of  some  values  of  E  (k)  for 
corresponding  values  of  n  and  w: 


w 

n  =*  1 

2 

3 

4 

5 

6 

.9 

1.1 

1.2 

1.4 

1.5 

1.7 

1.9 

.8 

1.3 

1.6 

2.0 

2.4 

3.0 

3.8 

.7 

1.4 

2.0 

2.9 

4.2 

5.9 

8.5 

.6 

1.7 

2.8 

4.6 

7.7 

13 

21 

.5 

2.0 

4.0 

8.0 

16 

32 

64 

.4 

2.5 

6.3 

16 

39 

98 

244 

.3 

3.3 

11 

37 

123 

412 

.2 

5.0 

25 

125 

625 

In  general,  we  would  probably  want  to  exceed  E(k)  trials  by  one  or 
two  standard  deviations  before  we  give  up.  We  note  that  the  standard 
deviation  of  k,  SD(k),  is  given  by: 

SD(k)  =■  sqrt[E(k2)  _  E(k)2] 

Then: 

E(k2)  =  SIGMA(i) :  {b*i2*ai“1> 

=  SIGMA(i):  <b*i*(i-l)*ai_1>  +  SISiA(i):  {b*i*ai"1> 
but  (using  the  geometric  series  identity  and  two  differentiations): 
2a/(l-a)3  =  SIGMA(i) :  {iMi-l^a1-1}  - 

thus: 

E(k2)  =  (2-b) / (b2) 

and: 

SD(k)  =»  [sqrt(l  -  wn)]*(l/wn) 

We  note  that  generally  SD(k)  will  be  approximately  equal  to  E(k); 
thus,  for  example,  if  (w  =  .5)  and  (n  =4),  then  E(k)  =16  and 
SD(k)  =*  15.5.  This  means  that  we  might  want  to  try  two  or  three  times 
the  expected  number  of  random  selections  implied  by  k  (as  tabulated 
above)  to  obtain  a  consensus  set  of  more  than  t  members. 
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From  a  slightly  different  point  of  view,  if  we  want  to  ensure  with 
probability  2  that  at  least  one  of  our  random  selections  is  an  error- 
free  set  of  n  data  points,  then  we  must  expect  to  make  at  least  k 
selections  (n  data  points  per  selection),  where: 

(l-b)k  =  (1-2) 
k  =  [ log ( 1—2 ) ] / [ log { 1— b ) ] 

For  example,  if  (w  =  .5)  and  (n  =  4),  then  (b  =  1/16).  To  obtain  a 
90%  assurance  of  making  at  least  one  error-free  selection, 

k  -  log (. l)/log(15/16)  =35.7 


C .  A  Lower  Bound  On  the  Si2e  of  an  Ac  cep  tab le  Consensus  Set 

The  threshold  t,  an  unspecified  parameter  in  the  formal  statement 
of  the  RANSAC  paradigm,  is  used  as  the  basis  for  determining  that  an  n- 
subset  of  P  has  been  found  that  implies  a  sufficiently  large  consensus 
set  to  permit  the  algorithm  to  terminate.  Thus,  t  must  be  chosen  large 
enough  to  satisfy  two  purposes:  that  the  correct  model  has  been  found 
for  the  data;  and  that  a  sufficient  number  of  mutually  consistent  points 
has  been  found  to  satisfy  the  needs  of  the  final  smoothing  procedure 
(which  computes  improved  estimates  for  the  model  parameters). 

To  ensure  against  the  possibility  of  the  final  consensus  set  being 
compatible  with  an  incorrect  model,  and  assuming  that  y  is  the 
probability  that  any  given  data  point  is  within  the  error  tolerance  of 
an  incorrect  model,  we  would  like  yt_n  to  be  very  small.  While  there  is 
no  general  way  of  precisely  determining  y,  it  is  certainly  reasonable  to 
assume  that  it  is  less  than  w  (w  is  the  a  priori  probability  that  a 
given  data  point  is  within  the  error  tolerance  of  the  correct  model). 
Assuming  y  <  .5,  a  value  of  t-n  equal  to  5  will  provide  a  better  than 
95%  probability  that  compatibility  with  an  incorrect  model  will  not 
occur . 
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To  satisfy  the  needs  of  the  final  smoothing  procedure,  the 
particular  procedure  to  be  employed  must  be  specified;  if  least-squares 
smoothing  is  to  be  used,  there  are  many  situations  where  formal  methods 
can  be  invoked  to  determine  the  number  of  points  required  to  produce  a 
desired  precision  (e.g.,  see  Sorenson  [1970]), 

D .  Example 

Let  us  apply  RANSAC  to  the  example  described  in  Figure  1.  A  value 
of  w  (the  probability  that  any  selected  data  point  is  within  the  error 
tolerance  of  the  model)  equal  to  .85  is  consistent  with  the  data,  and  a 
tolerance  (to  establish  datum/model  compatibility)  of  .8  units  was 
supplied  as  part  of  the  problem  statement.  We  will  accept  the  RANSAC- 
supplied  model  without  external  smoothing  of  the  final  consensus  set; 
thus,  we  would  like  to  obtain  a  consensus  set  that  contains  all  seven 
data  points.  Since  one  of  these  points  is  a  gross  error,  it  is  obvious 
that  we  will  not  find  a  consensus  set  of  the  desired  size,  and  so  we 
will  terminate  with  the  largest  set  we  are  able  to  find.  The  theory 
presented  earlier  indicates  that  if  we  take  two  data  points  at  a  time, 
compute  the  line  through  them  and  measure  the  deviations  of  the 
remaining  points  from  this  line,  we  should  expect  to  find  a  suitable 
consensus  set  within  two  or  three  trials;  however,  because  of  the 
limited  amount  of  data,  we  might  be  willing  to  try  all  21  combinations 
to  find  the  largest  consensus  set.  In  either  case,  we  easily  find  the 
consensus  set  containing  the  six  valid  data  points  and  the  line  that 
they  imply. 


8 


Ill  THE  LOCATION  DETERMINATION  PROBLEM  (LDP) 


A  core  problem  In  image  analysis  is  that  of  establishing  a 
correspondence  between  the  elements  of  two  representations  of  a  given 
scene.  One  variation  of  this  problem,  especially  important  in 
cartography,  is  to  determine  the  location  in  space  from  which  an  image 
or  photograph  was  obtained  by  recognizing  a  set  of  landmarks  ("control 
points")  appearing  in  the  image  (this  is  variously  called  the  problem  of 
determining  the  elements  of  exterior  camera  orientation,  or  the  camera 
calibration  problem,  or  the  Image-to-data-base-correspondence  problem) . 
It  is  routinely  solved  using  a  least-squares  technique  (e.g.,  see  Wolf 
[1974]  or  Keller  [1966])  with  a  human  operator  interactively 
establishing  the  association  between  image  points  and  the  three- 
dimensional  coordinates  of  the  corresponding  landmarks.  However,  in  a 
fully  automated  system,  where  the  correspondences  must  be  based  on  the 
decisions  of  marginally  competent  feature  detectors,  least  squares  is 
often  incapable  of  dealing  with  the  gross  errors  that  may  result;  this 
consideration,  discussed  at  length  in  the  preceding  section,  is 
illustrated  for  the  Location  Determination  Problem  In  an  example  to  be 
presented  later  (see  the  section  on  experimental  results). 

In  this  section  we  present  a  new  solution  to  the  Location 
Determination  Problem  (LDP)  based  on  the  RANSAC  paradigm,  which  is 
unique  in  its  ability  to  tolerate  gross  errors  in  the  input  data.  We 
will  first  examine  the  conditions  under  which  a  solution  to  the  LDP  is 
possible  and  describe  new  results  concerning  this  question;  we  then 
present  a  complete  description  of  the  RANSAC-based  algorithm,  and 
finally,  describe  experimental  results  obtained  through  use  of  the 
algorithm. 

We  formally  define  the  LDP  as  follows: 

Given  a  set  of  m  landmarks,  whose  3-D  coordinates  are 
known  in  some  coordinate  frame,  and  given  an  image  in 
which  some  subset  of  the  m  landmarks  is  visible,  determine 
the  location  (relative  to  the  coordinate  system  of  the 
landmarks)  from  which  the  image  was  obtained. 
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We  will  initially  assume  that  we  know  the  correspondences  between  n 
image  points  and  landmarks;  later  we  consider  the  situation  in  which 
some  of  these  correspondences  are  invalid.  We  will  also  assume  that 
both  the  principal  point  in  the  image  plane  (where  the  optical  axis  of 
the  camera  pierces  the  image  plane)  and  the  focal  length  of  the  imaging 
system  are  known;  thus  (see  Figure  2)  we  can  easily  compute  the  angle  to 
any  pair  of  landmarks  from  the  Center  of  Perspective  (CP).  Finally,  we 
assume  that  the  camera  resides  outside  and  "above"  a  convex  hull 
enclosing  the  control  points. 

We  will  later  demonstrate  (Appendix  A)  that  if  we  can  compute  the 
lengths  of  the  rays  from  the  CP  to  three  of  the  landmarks,  then  we  can 
directly  solve  for  the  location  of  the  CP  (and  the  orientation  of  the 
image  plane  if  desired).  Thus,  an  equivalent,  but  mathematically  more 
concise  statement  of  the  LDP,  is: 

Given  the  relative  spatial  locations  of  n  control  points, 
and  given  the  angle  to  every  pair  of  control  points  from 
an  additional  point  called  the  Center  of  Perspective  (CP), 
find  the  lengths  of  the  line  segments  ("legs")  joining 
the  CP  to  each  of  the  control  points.  We  call  this  the 
"perspective-n-point"  problem  (PnP). 

In  order  to  apply  the  RANSAC  paradigm,  we  wish  to  determine  the 
smallest  value  of  n  for  which  it  is  possible  to  solve  the  PnP  problem. 

A.  Solution  of  the  Perspective-N -Point  Problem 

The  PIP  problem  (n  =»  1)  provides  no  constraining  information,  and 
thus  an  infinity  of  solutions  is  possible.  The  P2P  problem  (n  =  2), 
illustrated  in  Figure  3,  also  admits  an  infinity  of  solutions;  the  CP 
can  reside  anywhere  on  a  circle  of  diameter  Rab/sin(Oab) ,  rotated  in 
space  about  the  chord  (line)  joining  the  two  control  points  A  and  B. 

The  P3P  problem  (n  =»  3)  requires  that  we  determine  the  lengths  of 
the  three  legs  of  a  tetrahedron,  given  the  base  dimensions  and  the  face 
angles  of  the  opposing  trihedral  angle  (see  Figure  4).  The  solution  to 
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this  problem  is  implied  by  the  three  equations  [A*]: 


(Rab)2  «  a2  +  b2  -  2*a*b*  [Cos (Gab)  ] 

(Rac)2  =  a2  +  c2  -  2*a*c*  [Cos (9ac)  ]  [A*] 
(Rbc) 2  =  b2  +  c2  -  2*b*c*[Cos(9bc)  ] 

It  is  known  that  n  independent  polynomial  equations,  in  n  unknowns, 
can  have  no  more  solutions  than  the  product  of  their  respective  degrees. 
Thus,  the  system  A*  can  have  a  maximum  of  eight  solutions.  However, 
because  every  term  in  the  system  A*  is  either  a  constant,  or  of  second, 
degree,  for  every  real  positive  solution  there  is  a  geometrically 
isomorphic  negative  solution.  Thus,  there  are  at  most  four  positive 
solutions  to  A*,  and  in  Figure  5  we  show  an  example  demonstrating  that 
the  upper  bound  of  four  solutions  is  attainable. 

In  Appendix  A  we  derive  an  explicit  algebraic  solution  for  the 
system  A*.  This  is  accomplished  by  reducing  A*  to  a  biquadratic 
(quartic)  polynomial  in  one  unknown  representing  the  ratio  of  two  legs 
of  the  tetrahedron,  and  then  directly  solving  this  equation  (we  also 
present  a  very  simple  iterative  method  for  obtaining  the  solutions  from 
the  given  problem  data). 

For  the  case  n  =■  4,  when  all  four  control  points  lie  in  a  common 
plane  (not  containing  the  CP,  and  such  that  no  more  than  two  of  the 
control  points  lie  on  any  single  line),  we  provide  a  technique,  in 
Appendix  B,  that  will  always  produce  a  unique  solution.  Surprisingly, 
when  all  four  control  points  do  not  lie  in  the  same  plane,  a  unique 
solution  cannot  always  be  assured;  an  example,  presented  in  Figure  6, 
shows  that  at  least  two  solutions  are  possible  for  the  P4P  problem  with 
the  control  points  in  "general  position." 

To  solve  for  the  location  of  the  CP  in  the  case  of  four  nonplanar 
control  points,  we  can  use  the  algorithm  presented  in  Appendix  A  on  two 
distinct  subsets  of  the  control  points  taken  three  at  a  time;  the 
solution(s)  common  to  both  subsets  locate  the  CP  to  within  the  ambiguity 
inherent  in  the  given  information. 
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The  approach  used  Co  construct  the  example  shown  in  Figure  6  can  be 
extended  to  any  number  of  additional  points.  It  is  based  on  the 
principal  depicted  in  Figure  3:  if  the  CP  and  any  number  of  control 
points  lie  on  the  same  circle,  then  the  angle  between  any  pair  of 
control  points  and  the  CP  will  be  independent  of  the  location  on  the 
circle  of  the  CP  (and  hence  the  location  of  the  CP  cannot  be 
determined).  Thus,  we  are  able  to  construct  the  example  shown  in 
Figure  7,  in  which  five  control  points  in  general  position  imply  two 
solutions  to  the  P5P  problem.  While  the  same  technique  will  work  for 
six  or  more  control  points,  four  or  more  of  these  points  must  now  lie  in 
same  plane  and  are  thus  no  longer  in  general  position. 

To  prove  that  six  (or  more)  control  points  in  general  position  will 
always  produce  a  unique  solution  to  the  P6P  problem,  we  note  that  for 
this  case  we  can  always  solve  for  the  12  coefficients  of  the  3x4 
matrix  T  that  specifies  the  mapping  (in  homogeneous  coordinates)  from 
three  space  to  two  space;  each  of  the  six  correspondences  provides  three 
new  equations  and  introduces  one  additional  unknown  (the  homogeneous 
coordinate  scale  factor).  Thus,  for  six  control  points,  we  have  18 
linear  equations  to  solve  for  the  18  unknowns  (actually,  it  can  be  shown 
that,  at  most,  17  of  the  unknowns  are  independent).  Given  the 
transformation  matrix  T,  we  can  construct  an  additional  (synthetic) 
control  point  lying  in  a  common  plane  with  three  of  the  given  control 
paints  and  compute  its  location  in  the  image  plane;  the  technique 
described  in  Appendix  B  can  now  be  used  to  find  a  unique  solution. 
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IV  IMPLEMENTATION  DETAILS  AND  EXPERIMENTAL  RESULTS 


A.  The  RANSAC/LD  Algorithm 

The  RANSAC/LD  algorithm  accepts  as  input  the  following  data: 

(1)  A  list  L  of  m  6-tuples — each  6-tuple  containing  the  3-D 
spatial  coordinates  of  a  control  point,  its  corresponding 
2-D  image  plane  coordinates,  and  an  optional  number 
giving  the  expected  error  (in  pixels)  of  the  given 
location  in  the  image  plane. 

(2)  The  focal  length  of  the  imaging  system  and  the  image 
plane  coordinates  of  the  principal  point. 

(3)  The  probability  (1-w)  that  a  6-tuple  contains  a  gross 
mismatch . 

(4)  A  "confidence"  number  G  which  is  used  to  set  the  internal 
thresholds  for  acceptance  of  intermediate  results 
contributing  to  a  solution.  A  confidence  number  of  one 
forces  very  conservative  behavior  on  the  algorithm;  a 
confidence  number  of  zero  will  call  almost  anything  a 
valid  solution. 

The  RANSAC/LD  algorithm  produces  as  output  the  following 
information: 

(1)  The  3-D  spatial  coordinates  of  the  lens  center  (i.e.,  the 
Center  of  Perspective),  and  an  estimate  of  the 
corresponding  error. 

(2)  The  spatial  orientation  of  the  image  plane. 

The  RANSAC/LD  algorithm  operates  as  follows: 

(1)  Three  6-tuples  are  selected  from  list  L-by  a  quasi-random 
method  that  ensures  a  reasonable  spatial  distribution  for 
the  corresponding  control  points.  This  initial  selection 
is  called  SI. 

(2)  The  CP  (called  CPI)  corresponding  to  selection  SI  is 
determined  using  the  closed-form  solution  provided  in 
Appendix  A;  multiple  solutions  are  treated  as  if  they 
were  obtained  from  separate  selections  in  the  following 
steps . 

(3)  The  error  in  the  derived  location  of  CPI  is  estimated  by 
perturbing  the  input  coordinates  (either  by  the  amount 
specified  in  the  6-tuples  or  by  a  default  value  of  one 
pixel),  and  recomputing  the  effect  this  would  have  on  the 
location  of  the  CPI. 
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(4)  Given  the  error  estimate  for  the  CPI,  we  use  the 
technique  described  in  Bolles  [1978]  to  determine  error 
ellipses  (dimensions  based  upon  the  supplied  confidence 
number)  in  the  image  plane  for  each  of  the  control  points 
specified  in  list  L;  if  the  associated  image  coordinates 
reside  within  the  corresponding  error  ellipse,  then  the 
6-tuple  is  appended  to  the  consensus  set  Sl/CPl. 

(5)  If  the  size  of  Sl/CPl  equals  or  exceeds  some  threshold 
value  t  (nominally  equal  to  a  value  between  7  and  raw), 
then  the  consensus  set  Sl/CPl  is  supplied  to  a  least- 
squares  routine  (see  Bolles  [1978]  or  Gennery  [1975])  for 
final  determination  of  the  CP  location  and  image  plane 
orientation.*  Otherwise,  the  above  steps  are  repeated 
with  a  new  random  selection  S2,  S3,  ... 

(6)  If  the  number  of  iterations  of  the  above  steps  exceeds 
k  ■  [log ( 1— G) ] / [log(l-w^) ] ,  then  the  largest  consensus 
set  found  so  far  is  used  to  compute  the  final  solution 
(or  we  terminate  in  failure  if  this  largest  consensus  set 
contains  fewer  than  six  members). 


B .  Experimental  Results 

To  demonstrate  the  validity  of  our  theoretical  results,  we 

performed  three  experiments.  In  the  first  experiment  we  found  a 

specific  Location  Determination  Problem  in  which  the  common  least- 
squares  pruning  heuristic  failed,  and  showed  that  RANSAC  successfully 
solved  this  problem.  In  the  second  experiment,  we  applied  RANSAC  to 
fifty  synthetic  problems  in  order  to  check  the  reliability  of  the 

approach  over  a  wide  range  of  parameter  values.  In  the  third  experiment 

we  used  standard  feature  detection  techniques  to  locate  landmarks  in  an 
aerial  image  and  then  used  RANSAC  to  determine  the  position  and 
orientation  of  the  camera. 


*  An  alternative  to  least  squares  would  be  to  average  the  parameters 
computed  from  random  triples  in  the  consensus  set  that  fall  within  (say) 
the  center  50%  of  the  associated  histogram. 
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A  Location  Determination  Problem  Ex amp le  of  a^  Leas t-S quares  Pruning 
Error 


The  LDP  in  this  experiment  was  based  upon  20  landmarks  and  their 
locations  in  an  image.  Five  of  the  twenty  correspondences  were  gross 
errors;  that  is,  their  given  locations  in  the  image  were  further  than  10 
pixels  from  their  actual  locations.  The  image  locations  for  the  "good*' 
correspondences  were  normally  distributed  about  their  actual  locations 
with  a  standard  deviation  of  one  pixel. 

The  heuristic  to  prune  gross  errors  was  the  following: 

*  Use  all  of  the  correspondences  to  instantiate  a  model. 

*  On  the  basis  of  that  model,  delete  the  correspondence  that 
has  the  largest  deviation  from  its  predicted  image 
location. 

*  Instantiate  a  new  model  without  that  correspondence. 

*  If  the  new  model  implies  a  normalized  error  for  the  deleted 
correspondence  that  is  larger  than  three  standard 
deviations,  assume  that  it  is  a  gross  error,  leave  it  out, 
and  continue  deleting  correspondences.  Otherwise,  assume 
that  it  is  a  good  correspondence  and  return  the  model  that 
included  it  as  the  solution  to  the  problem. 

This  heuristic  successfully  deleted  two  of  the  gross  errors;  but 
after  deleting  a  third,  it  decided  that  the  new  model  did  not  imply  a 
significantly  large  error,  so  it  returned  a  solution  based  upon  eighteen 
correspondences,  three  of  which  were  gross  errors. 

When  RANSAC  was  applied  to  this  problem,  it  located  the  correct 
solution  on  the  second  triple  of  selected  points.  The  final  consensus 
set  contained  all  of  the  good  correspondences  and  none  of  the  gross 
errors . 


D.  Fifty  Synthetic  Location  Determination  Problems 

In  this  experiment  we  applied  RANSAC  to  fifty  synthetic  LDPs.  Each 
problem  was  based  upon  thirty  landmark-to-image  correspondences.  A 
range  of  probabilities  were  used  to  determine  the  number  of  gross  errors 
in  the  problems;  the  image  location  of  a  gross  error  was  at  least  10 
pixels  from  its  actual  location.  The  location  of  a  good  correspondence 
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was  distributed  about  its  actual  location  with  a  normal  distribution 
having  a  standard  deviation  of  one  pixel.  Two  different  camera 
positions  were  used--one  looking  straight  down  on  the  landmarks  and  one 
looking  at  them  from  an  oblique  angle.  The  RANSAC  algorithm  described 
earlier  in  this  section  was  applied  to  these  problems;  however,  the 
simple  iterative  technique  described  in  Appendix  A  was  used  to  locate 
solutions  to  the  P3P  problems  in  place  of  the  closed  form  method  also 
described  in  that  appendix,  and  a  second  least-squares  fit  was  used  to 
extend  the  final  consensus  set  (as  suggested  in  second  section  of  this 
paper).  Table  1  summarizes  the  results  for  ten  typical  problems  (RANSAC 
successfully  avoided  including  a  gross  error  in  its  final  consensus  set 
in  all  of  the  problems);  in  five  of  these  problems  the  probability  of  a 
good  correspondence  was  0.8,  and  in  the  other  five  problems  it  was  0.6. 
The  execution  time  for  the  current  program  is  approximately  one  second 
for  each  camera  position  considered. 


No .  of 
Good 

Corresp. 

No.  of 
Corresp . 
in  Final 
Consensus 
Set 

No .  of 

Tr iples 
Cons  id e red 

No.  of 

Camera 

Positions 
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22 
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6 

10 

23 
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19 
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25 

25 
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2 

24 

23 

3 

8 

21 
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11 

21 

17 

17 

1 

1 

17 

16 

6 

8 

18 

16 

9 

21 

21 

18 

9 

15 

TABLE  1 
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A  ’'Real1'  Location  Determination  Problem 


Cross-correlation  was  used  to  locate  25  landmarks  in  an  aerial 
image  taken  from  approximately  4,000  feet  with  a  6-inch  lens.  The  image 
was  digitized  on  a  grid  of  2,000  by  2,000  pixels,  which  implies  a  ground 
resolution  of  approximately  two  feet  per  pixel.  Three  gross  errors  were 
made  by  the  correlation  feature  detector.  When  RANSAC  was  applied  to 
this  problem,  it  located  a  consensus  set  of  17  on  the  first  triple 
selected  and  then  extended  that  set  to  include  all  22  good 
correspondences  after  the  initial  least-squares  fit.  The  final  standard 
deviations  about  the  camera  parameters  were  as  follows: 


X: 

0.1 

feet 

Heading: 

.01  degrees 

Y: 

6.4 

feet 

Pitch: 

.10  degrees 

Z: 

2.1 

feet 

Roll: 

.12  degrees 

V  CONCLUDING  COMMENTS 

In  this  paper  we  have  introduced  a  new  paradigm.  Random  Sample 
Consensus  (RANSAC),  for  fitting  a  model  to  experimental  data.  RANSAC  is 
capable  of  interpreting/smoothing  data  containing  a  significant 
percentage  of  gross  errors,  and  thus  is  ideally  suited  for  applications 
in  automated  image  analysis  where  interpretation  is  based  on  the  data 
provided  by  error-prone  feature  detectors. 

A  major  portion  of  this  paper  describes  the  application  of  RANSAC 
to  the  Location  Determination  Problem  (LDP):  given  an  image  depicting  a 
set  of  landmarks  with  known  locations,  determine  that  point  in  space 
from  which  the  image  was  obtained.  Most  of  the  results  we  present 
concerning  solution  techniques  and  the  geometry  of  the  LDP  problem  are 
either  new  or  not  generally  known.  The  current  photo gramme trie 
literature  offers  no  analytic  solution,  other  than  variants  of  least 
squares  and  the  Church  method,  for  solving  the  perspective-n-point 
problems.  The  Church  method,  which  provides  an  iterative  solution  for 
the  P3P  problem,  is  presented  (see  Church  [1945]  or  Wolf  [1974])  without 
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any  indication  that  more  than  one  physically  real  solution  is  possible; 
there  is  certainly  no  indication  that  anyone  realizes  that  physically 
real  multiple  solutions  are  possible  for  more  than  three  control  points 
in  general  position.  (It  should  be  noted  that  because  the  multiple 
solutions  can  be  arbitrarily  close  together,  even  when  an  iterative 
technique  is  initialized  to  a  value  close  to  the  correct  solution,  there 
is  no  assurance  that  it  will  converge  to  the  desired  value). 

In  the  section  on  the  LDP  problem  (and  associated  appendices)  we 
have  completely  characterized  the  P3P  problem  and  provided  a  closed-form 
solution.  We  have  shown  that  multiple  physically  real  solutions  can 
exist  for  the  P4P  and  P5P  problems,  but  also  demonstrated  that  a  unique 
solution  is  assured  when  four  of  the  control  points  reside  on  a  common 
plane  (solution  techniques  are  provided  for  each  of  these  cases).  The 
issue  of  determining  the  maximum  number  of  solutions  possible  for  the 
P4P  and  P5P  problems  remains  open,  but  we  have  shown  that  a  unique 
solution  exists  for  the  P6P  problem  when  the  control  points  are  in 
general  position. 
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Appendix  A 


AN  ANALYTIC  SOLUTION  FOR  THE  PERSPECTIVE-3-POINT  PROBLEM 

In  the  main  body  of  this  paper,  we  have  established  that  P3P 
problems  can  have  as  many  as  four  solutions.  In  this  appendix,  we  will 
derive  a  closed  form  expression  for  obtaining  these  solutions.  Our 
approach  involves  three  steps:  we  first  find  the  lengths  of  the  legs  of 
the  ("perspective")  tetrahedron  given  the  base  (defined  by  the  three 
control  points)  and  the  face  angles  of  the  opposing  trihedral  angle  (the 
three  angles  to  the  three  pairs  of  control  points  as  viewed  from  the 
CP);  we  next  locate  the  CP  with  respect  to  the  3-D  reference  frame  in 
which  the  control  points  were  originally  specified;  and  finally,  compute 
the  orientation  of  the  image  plane  with  respect  to  the  reference  frame. 

a .  A  Solution  for  the  Perspective  Tetrahedron  (see  Figure  4) 

Given  the  lengths  of  the  three  sides  of  the  base  of  a 
tetrahedron  (Rab,Rac,Rbc) ,  and  given  the  corresponding  face  angles  of 
the  opposing  trihedral  angle  (9ab,9ac, 9b c) ,  find  the  lengths  of  the 
three  remaining  sides  of  the  tetrahedron  (a,b,c). 

A  solution  to  the  above  problem  can  be  obtained  by 
simultaneously  solving  the  system  of  equations: 

[Al]  (Rab)2  =  a2  +  b2  -  2*a*b*cos (9ab) 

[A2]  (Rac)2  =  a^  ■+•  c2  -  2*a*c*cos  (9ac) 

[A3]  (Rbc)2  a  b2  +  c2  -  2*b*c*cos (9bc) 

We  now  proceed  as  follows: 

[A4]  Let  b  =  x*a  and  c  =  y*a 

[A5]  (Rac)2  =>  a2  +  (y2)*(a2)  -  2*(a2)*y*cos(9ac) 

[A6]  (Rab)2  =>  a2  +  (x2)*(a2)  -  2*(a2) *x*cos (9ab) 

[A7]  (Rbc)2  =  (x2)*(a2)  +  (y2)*(a2)  -  2* (a2) *x*y*cos (9bc) 
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from  [A5]  and  [A7] 


[A8J  ((Rbc)2)*[  1  +  (y2)  -  2*y*  cos(9ac)  ]  = 

(CRac)2)*[  (x2)  +  (y2)  -  2*x*y*cos (9bc)  ] 

from  [A6]  and  [A7] 

[A9]  ((Rbc)2)*[  1  +  (x2)  -  2*x*cos(9ab)  ]  = 

(CRab)2)*[  (x2)  +  (y2)  -  2*x*y*cos (9bc)  ] 

(Rbc)2  (Rbc) 2 

[A10]  Let - -  Kl  and - -  K2 

(Rac)2  (Rab)2 

From  [A8]  and  [A9] 

[All]  0  =  (y2)*[l-Kl]  +  2*y*  [Kl*cos  (9ac) -x*cos  (9bc)] 

+  [ (x2)-K1] 

From  [A9]  and  [A10] 

[A12]  0  =  (y2)  +  2*y* [-x*cos (9bc) ] 

+  [(x2)*(1-K2)  +  2*x*K2*cos(9ab)  -  K2] 

Equations  [All]  and  [A12]  have  the  form: 

[A13]  0  =  m*(y2)  +  p*y  +  q 

[A14]  0  =  m'*(y2)  +  p'*y  +  q' 

Multiplying  [A13]  and  [A 14]  by  m'  and  m  respectively,  and  subtracting 
[A15]  0  =  [p*m'  -  p'*m]*y  +  [m'*q  -  m*q'] 


21 


Multiplying  [A13]  and  [A14]  by  q'  and  q  respectively,  subtracting,  and 
dividing  by  y: 

0  =*  [m'*q  -  m*q']*(y2)  +  [p'*q  -  p*q']*y 
[A16]  0  =  [m'*q  -  m*q']*y  +  [p'*q  -  p*q'] 

Assuming  m'*q  is  not  equal  (//)  to  m*q',  that  is 

C(x2)-K1]  0  [(X2)*(1-K1)*(1-K2)  +  2*x*K2*  (1-K1)  *cos  (9ab)  -  (1-K1)*K2] 

then  [A15]  and  [A16]  are  equivalent  to  [A  13]  and  [A14] . 

We  now  multiply  [A15]  by  [m'*q  -  m*q' ] ,  and  multiply  [A16]  by 
[p*ra'  -  p'*m],  and  subtract  to  obtain: 

[A17]  0  =  (m'*q  -  m*q')2  -  [p*m'  -  p'*m]*[p'*q  -p*q'  ] 

Expanding  [A17]  and  grouping  terms  we  obtain  a  biquadratic  (quartic) 
polynomial  in  x: 

[A  18 ]  0  =>  G4*(x^)  +  G3*(x3)  +  G2*(x2)  +  Gl*(x)  +  GO 

where : 

[A  19]  G4  =  (K1*K2  -  K1  -  K2)2  -  4*K1*K2*  [Cos(9bc)2] 

[A20]  G3  =•  4*  [K1*K2-Kl-K2]  *K2*  (1-K1) *Cos (9ab) 

+  4*Kl*Cos(9bc)*[(Kl*K2+K2  -Kl)*Cos(9ac) 

+  2*K2*Cos (9ab) *cos (9bc) ] 

[A21]  G2  -  [2*K2* (l-Kl)Cos(Oab) ] 2 

+  2* [K1*K2+Kl-K2] * [K1*K2— K1-K2] 

+  4*Kl*[(Kl-K2)*(Cos(«bc)2)  +  (1-K2)*K1* (Cos(9ac)2) 

-  2*K2*(1+K1 ) *Cos (9ab)*Cos  (9ac)  *Cos  (9bc) ] 

[A22]  G1  =  4*  (K1*K2+K1-K2 )  *K2*  (1  -K1 )  *Cos  (Gab) 

+  4*K1* [ (Kl*K2-Kl+K2)*Cos (9ac)*Cos (9bc) 

+  2*K1*K2  *Cos(9ab)*(Cos(9ac)2) ] 

[A23]  GO  =  (K1*K2+K1-K2)2  -  4*(K12 ) *K2* (Cos(9ac) 2) 
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Roots  of  [A23]  can  be  found  in  closed  form  (see  Dehn  [I960]),  or  by 
iterative  techniques  (see  Conte  [1965]). 

For  each  positive  real  root  of  [A18],  we  determine  a  single  positive 
real  value  for  each  of  the  sides  "a"  and  Mb."  From  [A6]  we  have: 

Rab 

[A24]  a  =  - 

SQRT  [(x2)  -  2*x*Cos (&ab)  +  1] 

and  from  [A4]  we  obtain: 

[A25]  b  =  a*x 


If  m'*q  //  m*q',  then  from  [A16]  we  have: 

p'*q  -  p*q' 

[A26]  y  =■ - -~ 

m*q'  —  m**q 


If  m'*q  =»  m*q',  then  [A26]  is  undefined  and  we  obtain  two  values  of  y 
from  [A5]: 

(Rac)2  -  (a2) 

[A27]  y  =■  Cos(Qac)  +-  SQRT  [(Cos(Gac))2  + - ] 

(a2) 


For  each  real  positive  value  of  y,  we  obtain  a  value  of  "c"  from  [A4] : 

[A28]  c  =  y*a 

When  values  of  y  are  obtained  from  [A5],  rather  than  [A26] ,  the 
resulting  solutions  can  be  invalid;  they  must  be  shown  to  satisfy  [A3] 
before  they  are  accepted. 

It  should  be  noted  that  because  each  root  of  [A18]  can 
conceivably  lead  to  two  distinct  solutions,  the  existence  of  the 
biquadratic,  by  itself,  does  not  Imply  a  maximum  of  four  solutions  to 
the  P3P  problem;  some  additional  argument,  such  as  the  one  given  in  the 
main  body  of  this  paper,  is  necessary  to  establish  the  upper  bound  of 
four  solutions. 
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b .  Example 


For  the  perspective  tetrahedron  shown  in  Figure  5,  we  have  the 
following  parameters: 

Rab  =  Rac  =  Rbc  =>  2*SQRT(3) 


Cos(8ab)  =*  Cos(0ac)  =  Cos(9bc) 


(a2)  +  (b2)  -  (Rab)2 


2*a*b 
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Substituting  these  values  into  equations  [A19]  through  [A23] ,  we  obtain 
the  coefficients  of  the  biquadratic  defined  in  [A18] : 

[-.5625,  3.515625,  -5.90625,  3.515625,  -.5625] 


The  roots  of  the  above  equation  are: 

Cl,  1,  4,  0.25] 


For  each  root  we  have: 


ROOT  a 

1  4 

1  4 

4  1 

.25  4 


c 


4 

1 

4 

4 


c.  An  Iterative  Solution  for  the  Perspective  Tetrahedron  (see 
Figure  8) 

A  simple  way  to  locate  solutions  to  P3P  problems,  which  is 
sometimes  an  adequate  substitute  for  the  more  involved  procedure 
described  in  the  preceding  subsection,  is  to  slide  one  vertex  of  the 
control-point  triangle  down  its  leg  of  the  tetrahedron  and  look  for 
positions  of  the  triangle  in  which  the  other  two  vertices  lie  on  their 
respective  legs.  If  vertex  A  is  at  a  distance  "a"  from  L  (L  is  the 
center  of  perspective),  the  lengths  of  the  sides  Rab  and  Rac  restrict 
the  triangle  to  four  possible  positions.  Given  the  angle  between  legs 
LA  and  LB,  compute  the  distance  of  point  A  from  the  line  LB  and  then  . 
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compute  points  B1  and  B2  on  LB  that  are  at  the  proper  distance  from  A  to 
insert  a  line  segment  of  length  Rab.  Similarly,  we  compute  (at  most) 
two  locations  for  C  on  its  leg.  Thus,  given  a  position  for  A,  we  have 
found  (at  most)  four  positions  for  a  triangle  that  have  one  side  of 
length  Rab  and  one  of  length  Rac.  The  lengths  of  the  third  sides  (BC) 
of  the  four  triangles  vary  (non-linearly)  as  point  A  is  moved  down  its 
leg.  Solutions  to  the  problem  can  be  obtained  by  iteratively 
repositioning  A  to  imply  a  third  side  of  the  required  length. 


d.  Computing  the  3-D  Location  of  the  Center  of  Perspective  (see 
Figure  9) 

Given  the  three-dimensional  locations  of  the  three  control 
points  of  a  perspective  tetrahedron,  and  the  lengths  of  the  three  legs, 
the  3-D  location  of  the  center  of  perspective  can  be  computed  as 
follows : 

(1)  Construct  a  plane  PI  that  is  normal  to  AB  and  passes 
through  the  center  of  perspective,  L.  This  plane  can  be 
constructed  without  knowing  the  position  of  L,  which  is 
what  we  are  trying  to  compute.  Consider  the  face  of  the 
tetrahedron  that  contains  vertices  A,  B,  and  L.  Knowing 
the  lengths  of  sides  LA,  LB  and  AB,  we  can  use  the  law  of 
cosines  to  find  the  angle  LAB,  and  then  the  projection  QA 
of  LA  on  AB.  (Note  that  angle  LQA  is  a  right  angle,  and 
the  point  Q  is  that  point  on  line  AB  that  is  closest  to 
L).  Construct  a  plane  normal  to  AB  passing  through  Q; 
this  plane  also  passes  through  L. 

(2)  Similarly  construct  a  plane  P2  that  is  normal  to  AC  and 
passes  through  L. 

(3)  Construct  the  plane  P3  defined  by  the  three  points  A,  B, 
and  C. 


(4) 

Intersect  planes  PI, 

P2,  and 

P3. 

By  construction. 

the 

point  of  intersection 
closest  to  L. 

R  is 

the 

point  on  P3 

that 

is 

(5) 

Compute  the  length  of 

the 

line 

AR  and  use 

that 

in 

conjunction  with  the  length  of  LA  to  compute  the  length 
of  the  line  RL,  which  is  the  distance  of  L  from  the  plane 
P3. 

(6)  Compute  the  cross  product  of  vectors  AB  and  AC  to  form  a 
vector  perpendicular  to  P3.  Then  scale  that  vector  by 
the  length  of  RL  and  add  it  to  R  to  get  the  3-D  location 
of  the  center  of  perspective  L. 
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If  the  focal  length  of  the  camera  and  the  principal  point  in 

the  image  plane  are  known,  it  is  possible  to  compute  the  orientation  of 

the  image  plane  with  respect  to  the  world  coordinate  system;  that  is, 

the  location  of  the  origin  and  the  orientation  of  the  image  plane 

coordinate  system  with  respect  to  the  3-D  reference  frame.  This  can  be 
done  as  follows: 

(1)  Compute  the  3-D  reference  frame  coordinates  of  the  center 
of  perspective  (as  described  above) . 

(2)  Compute  the  3-D  coordinates  of  the  image  locations  of  the 
three  control  points:  since  we  know  the  3-D  coordinates 
of  the  CP  and  control  points,  we  can  compute  the  3-D 
coordinates  of  the  three  rays  between  the  CP  and  the 
control  points.  Knowing  the  focal  length  of  the  imaging 
system,  we  can  compute,  and  subtract  from  each  ray,  the 
distance  from  the  CP  to  the  image  plane  along  the  ray. 

(3)  Compute  the  equation  of  the  plane  containing  the  image 
using  the  three  points  found  in  step  (2).  The  normal  to 
this  plane,  passing  through  the  CP,  gives  us  the  origin 
of  the  image  plane  coordinate  system  (i.e.,  the  3-D 
location  of  the  principal  point),  and  the  Z  axis  of  this 
system. 

(4)  The  orientation  of  the  image  plane  about  the  Z  axis  can 
be  obtained  by  computing  the  3-D  coordinates  of  a  vector 
from  the  principal  point  to  any  one  .of  the  points  found 
in  (2). 
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Appendix  B 


AN  ANALYTIC  SOLUTION  FOR  THE  PERSPECTIVE-4-P0INT  PROBLEM 
(with  all  control  points  lying  in  a  common  plane) 

In  this  appendix,  we  present  an  analytic  technique  for  obtaining  a 
unique  solution  to  the  P4P  problem,  when  the  four  given  control  points 
all  lie  in  a  common  plane: 

a .  Problem  Statement  (see  Figure  10) 

GIVEN:  a  correspondence  between  four  points  lying  in  a  plane  in  3-D 
space  (called  the  object  plane),  and  four  points  lying  in  a  distinct 
plane  (called  the  image  plane);  and  given  the  distance  between  the 
center  of  perspective  and  the  image  plane  (i.e.  ,  the  focal  length  of  the 
imaging  system);  and  also  given  the  principal  point  in  the  image  plane 
(i.e.,  the  location,  in  image  plane  coordinates,  of  the  point  at  which 
the  optical  axis  of  the  lense  pierces  the  image  plane). 

FIND:  the  3-D  location  of  the  Center  of  Perspective  relative  to  the 
coordinate  system  of  the  object  plane. 

b .  Notation 

*  Let  the  four  given  image  points  be  labeled  {Pi},  and  the 
four  corresponding  object  points  {Qi}. 

*  We  will  assume  that  the  2-D  Image  Plane  coordinate  system 
has  its  origin  at  the  principal  point  (PPI). 

*  We  will  assume  that  the  Object  Plane  has  the  equation  Z  =  0 
in  the  reference  coordinate  system.  Standard  techniques 
are  available  to  transform  from  this  coordinate  system  into 
a  ground  reference  frame  (e.g.,  see  Duda  [1973]  or  Rogers 
[1976]). 

*  Homogeneous  coordinates  will  be  assumed  (e.g.,  see  Wylie 
[1970]). 

*  Primed  symbols  represent  transposed  structures. 
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Solution  Procedure 


c . 


a)  Compute  the  3x3  collineation  matrix  T  which  maps  points  from 
Object  Plane  to  Image  Plane  (a  procedure  for  computing  T  is  given 
later): 

(1)  [Pi]  =  [T ]  *  [Qi ] 

where  [Pi]  a  [ki*xi, ki*yi,ki] ' 

[Qi]  =  [Xi , Yi ,  1  ]  ' 


b)  The  ideal  line  in  the  Object  Plane,  with  coordinates  [0,0,1]',  is 
mapped  into  the  vanishing  line  in  the  Image  Plane  [VLI]  by  the 
transformation: 

(2)  [VLI]  =  [inv[T] ] '*[0,0,1]' 


c)  Determine  the  distance  DI  from  the  origin  of  the  Image  Plane  (PPI) 
to  the  vanishing  line  [VLI]  =*  [al,a2,a3]': 

(3)  |  a3  i 

|  sqrt[(al)2  +  (a2)2]  | 

d)  Solve  for  the  dihedral  (tilt)  angle  9  between  the  Image  and 
Object  planes: 


f 

(4)  9  =*  a  ret  an  ( - ) 

DI 

where  f  =■  focal  length 


e)  The  ideal  line  in  the  Image  Plane  with  coordinates  [0,0,1]'  is 
mapped  into  the  vanishing  line  in  the  Object  Plane  [VL0]  by  the 
transform: 

(5)  [VL0]  =*  [T] '*[0,0,1]  ' 


f )  Compute  the  location  of  point  [PP0]  in  the  Object  Plane  ( [PP0]  is 
the  point  at  which  the  optical  axis  of  the  lense  pierces  the  object 
plane) : 

(6)  [PP0]  -  [inv[T] ] '*[0,0,1]' 
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g)  Compute  the  distance  DO  from  [PPO]  =  [cl,c2,c3]'  to  the  vanishing 
line  [VLO]  =  [bl,b2,b3]'  in  the  Object  Plane: 

[  bl *cl  +  b2*c2  +  b3*c3  | 

(7)  DO  *  | - | 

| c3*sqrt [ (bl )2  +  (b2)2]  | 

h)  Solve  for  the  "pan"  angle  $  as  the  angle  between  the  normal  to 
[VLO]  =  [bl,b2,b3]'  and  the  X  axis  in  the  Object  Plane: 

-b2 

(8)  $  =>  arctan( - ) 

bl 


i)  Determine  XSGN  and  YSGN: 

If  a  line  (parallel  to  the  X  axis  in  the  object  plane)  through 
[PPO]  intersects  [VLO]  to  the  right  of  [PPO],  then  XSGN  =  1 
else  XSGN  =»  -1.  Thus 


bl*cl  +  b2*c2-+  b3*c3 
bl*c3 

then  XSGN  =  1  else  XSGN  =  -1 
Similarly, 

bl*cl  +  b2*c2  +  b3*c3 

(10)  if - <  0 

b2*c3 

then  YSGN  =  1  else  YSGN  =■  -1 


j)  Solve  for  the  location  of  the  CP  in  the  object  plane  coordinate 


system: 

(11) 

DCP  =  D0*sin(9) 

(12) 

XCP  =  XSGN*abs[DCP*Sin(9)*Cos($)] 

+  cl/c3 

(13) 

YCP  =  YSGN*abs[DCP*Sin(9)*Sin($)] 

+  c2/c 3 

(H) 

ZCP  =  DCP*cos (9) 

Note:  If  [VLI] ,  as  determined  in  (b),  has  the  coordinates  [0,0,k], 

then  the  image  and  object  planes  are  parallel  (9=0).  Rather 
than  continuing  with  the  above  procedure,  we  now  solve  for  the 
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desired  information  using  similar  triangles  and  Euclidean 
geometry. 


d.  Computing  the  Collineat ion  Matrix  T 


Let: 

I  I  XI  Y1  1|| 

[Q]  =  ||  X2  Y2  1||=  [[Q1]',[Q2]',[Q3]'] 

I  I  X3  Y3  1|| 

I |  xl  yl  1|| 

[P]  =  ||  x2  y2  1  M  -  [[P1]\[P2]',[P3]'] 

| |  x3  y3  1|| 


[Q4]  =  [X4,Y4, 1] ' 

[P4]  =  [x4, y4, 1  ] ' 

[V]  =  [inv  [P]  ]  '*  [P4]  =  [vl  ,v2, v3  ]  ' 

[R]  =  [inv  [Q]  ]  '*  [Q4]  =  [rl,r2,r3]' 

vl  r3 
wl  =  —  *  — 

rl  v3 

v2  r3 
w2  =  —  *  — 

r2  v3 


I  I  wl  0  0|| 

[w]  =  | |  0  w2  0|| 

110  0  1|| 


Then: 

[T]  '  =  [INV  [Q]  ]  *  [W]  *  [P ] 
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Such  that: 


[Pi]  =  ki*[xi,yi,l]  =>  [T]  * [Qi] 


e .  Example 


Given: 


f  =  .3048  meters  (12  inches) 


PI  = 

(-.071263, 

.029665) 

Ql  - 

(  -30, 

80) 

P2  =» 

(-.053033, 

-.006379) 

Q2  = 

(-100, 

-20) 

P3  =» 

(-.014063, 

.061579) 

Q3  =* 

(  140, 

50) 

P4  = 

(  .080120, 

-.030305) 

Q4  = 

(  -40, 

-240) 

| |  .000212 

.000236 

.000925  | | 

a)  [T]  '  =■  |  |  -.000368 

.000137 

.000534  | | 

| |  -.025404 

.021650 

.843879  |  | 

||  1117.14 

-2038.86 

o.o  1 1 

[ inv [T ]  ]'  -  |  j  3371.56 

2302.22 

-5.14991  || 

1 |  -51.0636 

-120.442 

1.31713  |  | 

b) 

[VLI] 

= 

[0,  -5.14991,  1.31713]' 

c) 

DI 

2* 

.255758 

d) 

9 

2* 

.872665  radians  (50  degrees) 

e) 

[VLO] 

= 

[  .000925,  .000534,  .843880]' 

f) 

[PPO] 

[-51.0636,  -120.442,  1.31713]' 

g) 

DO 

711.196 

h) 

$ 

=■ 

-.523599  radians  (-30  degrees) 

i) 

XSGN 

-1 

YSGN 

= 

-1 

j) 

DCP 

32 

544.8081 

XCP 

-400.202 

YCP 

= 

-300.117 

ZCP 

= 

350.196 
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Figure  f :  Failure  of  Least  Squares  (and  the  "throwing  out  the  worst 
residual"  heuristic),  to  deal  with  an  erroneous  data  point 


Problem:  Given  the  set  of  seven  (x,  y)  pairs  shown  in  the  plot, 

find  a  best  fit  line,  assuming  that  no  valid  datum 
deviates  from  this  line  by  more  than  0.8  units. 


5 

4 

3 

2 

1 

0 


Ideal  Model  Line 


Gross  Error  (Point  7) 


^Final  Least 
i  f  Squares  Line 


Point  x  y 

1  0  0 

2  1  1 

3  2  2 

4  3  2 

5  3  3 

6  4  4 

7  10  2 


Comment  Six  of  the  seven  points  are  valid  data  and  can  be  fit  by 
the  solid  line.  Using  Least  Squares  (and  the  "throwing 
out  the  worst  residual"  heuristic),  we  terminate  after 
four  iterations  with  four  remaining  points,  including 
the  gross  error  at  (10,  2)  fit  by  the  dashed  line. 


i  Successive 
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FIGURE  5  AN  EXAMPLE  SHOWING  FOUR  DISTINCT 
SOLUTIONS  TO  A  P3P  PROBLEM 


Consider  the  tetrahedron  in  Figure  5a.  The  base 
ABC  is  an  equilateral  triangle  and  the  "legs" 
(i.e.,  LA,  LB,  and  LC)  are  all  equal.  Therefore, 
the  three  face  angles  at  L  (i.e.,  <ALB,  <ALC,  and 
<BLC)  are  all  equal.  3y  the  law  of  cosines  ve 
have : 


Cos(Alpha)  »  5/8- 

This  tetrahedron  defines  one  solution  to  a  P3P 
problem.  A  second  solution  is  shown  in  Figure  5b. 
It  is  obtained  from  the  first  by  rotating  L  about 
BC.  It  is  necessary  to  verify  that  the  length  of 
L'A  can  be  1,  given  the  rigid  triangle  ABC  and  the 
angle  alpha.  From  the  law  of  cosines  we  have: 

2  2  2 

(2*sqrt(3))  -  4  +  (L'A)  -  2*4*(L’ A)*(  5/8 ) 

which  reduces  to: 

(L'A  -  t)  *  (L'A  -  4)  -  0. 

Therefore,  L'A  can  be  either  I  or  4.  Figure  5a 
illustrates  the  L'A  a  4  case  and  Figure  5b 
illustrates  the  L'A  =■  1  case. 

Notice  that  repositioning  the  base  triangle  so  that 
its  vertices  move  to  different  locations  on  the 
legs  is  equivalent  to  repositioning  L.  Figure  5c 
shows  the  position  of  the  base  triangle  that 
corresponds  to  the  second  solution. 

Since  the  tetrahedron  in  Figure  5a  is  threefold 
rotationally  symmetric,  two  more  solutions  can  be 
obtained  by  rotating  the  triangle  about  AB  and, AC. 
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FIGURE  6  AN  EXAMPLE  OF  A  P4P  PROBLEM 
WITH  TWO  SOLUTIONS 

Figure  6a  specifies  a  P4P  problem  and  demonstrates 
one  solution.  A  second  solution  can  be  achieved  by- 
rotating  the  base  about  BC  so  that  A  is  positioned 
at  a  different  point  on  its  leg  (see  Figure  6b).  To 
verify  that  this  is  a  valid  solution  consider  the 
plane  X  =  0,  which  is  normal  to  BC  and  contains  the 
points  L,  A,  and  D.  Figure  6c  shows  the  important 
features  in  this  plane.  The  cosine  of  alpha  is 
119/169.  A  rotation  of  beta  about  BC  repositions  A 
at  A'.  The  law  of  cosines  can  be  U3ed  to  verify  the 
position  of  A'  . 

To  complete  this  solution  it  is  necessary  to  verify 
that  the  rotated  position  of  D  is  on  LD.  Consider 
the  point  D'  in  Figure  6c.  It  is  at  the  same 
distance  from  P  as  D  is  and  by  the  law  of  cosines 
we  can  show  that  gamma  equals  beta.  Therefore,  D', 
which  is  on  LD,  is  the  rotated  position  of  D.  The 
points  A‘ ,  3,  C,  and  D'  form  the  second  solution  to 
the  problem. 
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FIGURE  7  AN  EXAMPLE  OF  A  P5P  PROBLEM 
WITH  TWO  SOLUTIONS 

This  example  ia  the  same  as  the  PdP  example  deaeiibed 
in  Figure  6  except  that  a  fifth  control  point,  E, 
has  been  added.  The  initial  position  for  E  and  its 
rotated  position,  E' ,  are  3hown  in  Figure  7.  The 
points  S  and  E’  were  constructed  to  be  the  mirror 
images  of  A*  and  A  about  the  line  LP;  therefore,  a 
rotation  of  alpha  about  P  repositions  E  at  E' .  One 
solution  of  the  P5P  problem  is  formed  by  points  A, 

B,  C,  and  D  (shown  in  Figure  6a)  plus  point  E.  The 
second  solution  is  formed  by  points  A',  B,  C,  D* , 
and  E‘ .  Consequently  there  are  two  different 
positions  of  L  such  that  all  five  points  lie  on 
their  appropriate  legs. 


FIGURE  B  GEOMETRY  FOR  AN  ITERATIVE 
SOLUTION  TO  THE  P3P  PROBLEM 
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FIGURE  9  COMPUTING  THE  3-D  LOCATION  OF  FIGURE  10  GEOMETRY  OF  THE  P4P  PROBLEM 

THE  CENTER  OF  PERSPECTIVE  (L)  (WITH  ALL  CONTROL  POINTS 

LYING  IN  A  COMMON  PLANE) 


